A RECURSIVE LEAST SQUARES APPROACH TO CALCULATE MOTION

PARAMETERS FOR A MOVING CAMERA 

BACKGROUND 

[0001] The present subject matter relates to methods and systems for
reconstructing 3-dimensional objects from sequences of 2-dimensional images. It finds 
particular application in conjunction with the estimation of camera motion parameters 
from data obtained with a moving camera, and will be described with particular 
reference thereto. However, it is to be appreciated that it is also amenable to other like 
applications. 

[0002] An increase in quality, coupled with a decrease in price, of digital camera
equipment has led to growing interest in reconstructing 3-dimensional objects from
sequences of 2-dimensional images. Further, the ready availability of high quality and 
low price digital cameras has led to the development of models and systems that allow 
the capture of accurate 3-D spatial data from a sequence of 2-D images. One 
approach has been to collect the sequence of 2-D images from an object space by 
moving the camera along a predetermined path. Using the image sequence and the 
concepts of triangulation and parallax, 3-D spatial data from the object space may be 
recovered. 

[0003] The quality of the 3-D reconstruction of the object space is dependent on 
many factors. Among them are resolution of the sensors used, lighting conditions, the 
object details, and calibration errors. There are several sources that contribute to 
calibration errors such as, e.g., inaccuracies inherent in the camera lens, including 
inaccuracies in the lens specifications, and inaccuracies in the means used to move the 
camera along the desired path. Therefore, proper calibration requires estimating the
error introduced by camera movement and providing algorithms to remove or
compensate for this error. This process is referred to hereinafter as moving camera
calibration.

[0004] Moving camera calibration consists primarily of two steps. The first step is to 
use a sequence of images to estimate the 3-D motion parameters of a moving camera, 
and the second step is to design a filter to correct the error between the desired motion 
and the estimated motion. The problems of camera parameter estimation have been
addressed by a wide range of researchers. Their approaches have been successful in
systems with little or no noise. However, in most cases, the 3-D transformation has 
been modeled as a nonlinear stochastic system, and it is necessary to estimate the 
state variables from noisy observations. 

[0005] There are several sources that contribute to observation noise, including 
camera motion, projection noise, and/or random disturbances from a moving object. To 
solve the observation noise problem, Kalman filters, such as the iterated extended
Kalman filter (IEKF), have been widely used as nonlinear estimators. For example,
Denzler and Zobel use a Kalman filter to estimate camera parameters with a selected
focal length (On optimal camera parameter selection
in Kalman filter based object tracking, by J. Denzler, M. Zobel and H. Niemann, Pattern 
Recognition, 24th DAGM Symposium, Zurich, Switzerland, pp. 17-25, 2002). Koller 
and Klinker use an extended Kalman filter to estimate the motion of the camera and the 
extrinsic camera parameters (Automated camera calibration and 3D egomotion
estimation for augmented reality applications by D. Koller, G. Klinker, E. Rose, D. 
Breen, R. Whitaker and M. Tuceryan, 7th International Conference on Computer 
Analysis of Images and Patterns (CAIP-97), Kiel, Germany, September 10-12, 1997). 
Goddard and Abidi use a dual quaternion-based iterated extended Kalman filter to
estimate relative 3-D position and orientation (pose) (Pose and motion estimation using
dual quaternion-based extended Kalman filtering by J.S. Goddard, M.A. Abidi, A
Dissertation Presented for the Doctor of Philosophy Degree, The University of
Tennessee, Knoxville). Venter and Herbst use an unscented Kalman filter to estimate 
the motion of an object from a video sequence under perspective projection (Structure 
from motion estimation using a non-linear Kalman filter by C. Venter, B. Herbst, Dept. of 
Electronic Engineering and Dept. of Applied Mathematics, University of Stellenbosch, 
7600, South Africa). The above-described references are incorporated herein by 
reference. 

[0006] The accuracy of camera motion models depends primarily on two sets of 
parameter estimates. The first set of parameters includes lens parameters such as, 
e.g., focal length, principal point, and distortion parameters. The second set of 
parameters includes a set of motion parameters that enables the comparison of a 
moving camera's theoretically determined physical location to a desired location.

[0007] The present invention is directed toward improving the accuracy of the
second set of parameters, i.e. the estimation of the set of 3-D motion parameters from 
data obtained with a moving camera. A method is provided that uses Recursive Least 
Squares (RLS) for camera motion parameter estimation with observation noise. This is 
accomplished by calculation of hidden information through camera projection and 
minimization of the estimation error. The present invention also provides a method for 
designing a filter, based on the motion parameters estimates, to correct for errors in the 
camera motion. 

SUMMARY 

[0008] In accordance with an exemplary embodiment, there is provided a method for
estimating camera motion parameters. The method comprises obtaining an 
observation point set including a plurality of observed point vectors, computing a 
plurality of motion output vectors by performing a recursive least squares (RLS) 
process based on a plurality of motion parameter vectors, and comparing the plurality 
of motion output vectors to the plurality of observed point vectors. 
[0009] There is also provided a method for determining a filter to correct for camera 
motion errors. The filter determining method comprises determining a plurality of 
desired motion point vectors, computing a plurality of estimated motion point vectors by 
means of an RLS algorithm, and computing the filter based on a difference between the 
plurality of estimated motion point vectors and the plurality of desired motion point 
vectors. 

[0010] There is further provided a system for estimating and filtering camera motion 
parameters. The system comprises a movable digital camera for generating a plurality 
of 2-dimensional images of a scene or object, a control means for translating and 
rotating the camera along a predetermined trajectory, and a computer system for 
receiving and processing the plurality of 2-dimensional images. The computer system 
includes a user interface for receiving instructions from a user and for providing output 
to a user, an image input means for receiving the plurality of 2-dimensional images from 
the camera, a storage means for storing programs and the plurality of images, a 
program to determine a desired motion, a program to compute an estimated motion by 
means of an RLS program, a program to compute a filter rotation matrix and a filter
translation vector based on the difference between the estimated motion and the
desired motion, and a program to compute a corrected output based on the filter. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0011] The invention may take form in various components and arrangements of 
components, and in various process operations and arrangements of process 
operations. The drawings are only for the purpose of illustrating preferred embodiments 
and are not to be construed as limiting the invention. 

[0012] FIGURE 1 is a flowchart describing the RLS algorithm without noise; 
[0013] FIGURE 2 is a flowchart describing the RLS algorithm with random noise; 
[0014] FIGURE 3 is an exemplary system for performing validation experiments 
according to the present invention; 

[0015] FIGURE 4 is a plot of an exemplary observed point trajectory compared to a
desired point trajectory; 

[0016] FIGURE 5 is a plot of exemplary rotation variables for the RLS algorithm; 
[0017] FIGURE 6 is a plot of exemplary translation and noise variables for the RLS 
algorithm; 

[0018] FIGURE 7 is a plot of exemplary parameter and state estimate errors for the 
RLS algorithm; 

[0019] FIGURE 8 is a 3-D graph showing a corrected point trajectory, an observed 
point trajectory, and a desired point trajectory; and 

[0020] FIGURE 9 is a system incorporating an embodiment of the present invention. 

DETAILED DESCRIPTION OF PREFERRED EMBODIMENT(S) 
[0021] A preferred embodiment of the present invention uses a recursive least 
squares algorithm to estimate position and orientation of a moving camera that is 
subject to observation noise. The estimation of the 3-D motion parameters is obtained 
by identifying certain object features, such as, for example, a set of points or a line, that 
are located and matched in the images in the sequence taken by the camera. In the 
preferred embodiment, motion calibration is done by locating a set of feature points in 
the images. A moving camera and a static object are assumed herein; however, it is
readily apparent that this is equivalent to estimating the 3-D motion of a moving object
observed by a static camera. 

[0022] The motion of a camera can be described as the motion of a point in 3-D 
space. Additionally, in a dynamic system, the motion can be described by evolution of a 
set of state variables, preferably including three rotation parameters and three
translation parameters. If $X = [x_1 \; x_2 \; x_3]'$ is a point on a curve that describes the
motion of the camera, subsequent points on the curve may be obtained by rotating $X$
by a small angle $\alpha$ about an axis, e.g., $A$, and then by translating the rotated point by
a translation vector, e.g., $T = [t_x \; t_y \; t_z]'$, where $t_x$, $t_y$, and $t_z$ are the above-described
translation parameters. In vectors and matrices provided herein, the prime notation
indicates a matrix transpose.

[0023] To determine the three rotation parameters, as widely known in the art, Euler
angles, $\alpha_x$, $\alpha_y$, and $\alpha_z$, are calculated for the axis $A$. Rotation about $A$ by the angle
$\alpha$ is then equivalent to rotating about the Z-axis by $\alpha_z$, then rotating about the Y-axis
by $\alpha_y$, and then rotating about the X-axis by $\alpha_x$. The rotation matrix for rotating about
the axis $A$ is the product of 3 rotation matrices about the coordinate axes as just
described. Hence the model for rotation followed by translation can be written as:

$$X_{t+1} = R X_t + T \tag{1}$$

where

$$R = \begin{bmatrix} 1 & \omega_1 & \omega_2 \\ -\omega_1 & 1 & \omega_3 \\ -\omega_2 & -\omega_3 & 1 \end{bmatrix} \tag{2}$$

The variables $\omega_1$, $\omega_2$ and $\omega_3$ of the cumulative rotation matrix $R$ are derived from the
Euler angles described above and, along with $t_x$, $t_y$ and $t_z$, form a set of 6 motion
parameters. It is to be appreciated by one skilled in the art that other 3D rotation
matrices exist, and the present invention is not limited to the above-described rotation
matrix.
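As a minimal sketch of the motion model of Equations (1) and (2), the following Python fragment builds the rotation matrix from three rotation variables and applies one motion step. The function names and all numeric values are hypothetical and serve only to exercise the model; NumPy is assumed to be available.

```python
import numpy as np

def rotation_matrix(w1, w2, w3):
    """Rotation matrix R in the form of Equation (2)."""
    return np.array([
        [1.0,  w1,  w2],
        [-w1, 1.0,  w3],
        [-w2, -w3, 1.0],
    ])

def motion_step(x, R, T):
    """One step of Equation (1): X_{t+1} = R X_t + T."""
    return R @ x + T

# Hypothetical rotation variables, translation vector, and starting point.
R = rotation_matrix(0.001, -0.002, 0.003)
T = np.array([2.0, 0.0, 0.0])
x0 = np.array([10.0, 20.0, 30.0])

x1 = motion_step(x0, R, T)
print(x1)
```

Repeated calls to `motion_step` would trace out the curve described in paragraph [0022].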

[0024] Having established the form of the equation of motion, to estimate the
camera motion parameters, the matrices $R$ and $T$ from Equation (1) are now identified.
For these purposes, Equation (1) can be rewritten as the following linear regression
model:

$$y_t = \phi_t' \theta \tag{3}$$

where

$$y' = [x_1 \; x_2 \; x_3] \tag{4}$$

$$\phi' = \begin{bmatrix} x_1 & x_2 & x_3 & 0 & 1 & 0 & 0 \\ x_2 & -x_1 & 0 & x_3 & 0 & 1 & 0 \\ x_3 & 0 & -x_1 & -x_2 & 0 & 0 & 1 \end{bmatrix} \tag{5}$$

$$\theta' = [1 \; \omega_1 \; \omega_2 \; \omega_3 \; t_x \; t_y \; t_z] \tag{6}$$

[0025] Components of the matrix $\phi$ consist of the state variables $x_1$, $x_2$ and $x_3$.
These state variables can be determined to certainty or they may be subject to noise.
The vector $\theta$ essentially consists of the six above-described motion parameters and,
therefore, $\theta$ is herein referred to as the parameter vector.
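The equivalence between the motion model and the regression form can be checked numerically. The sketch below, using hypothetical values and a hypothetical helper named `regressor`, builds the 3x7 state matrix from a point and confirms that multiplying it by the parameter vector gives the same result as rotating and translating the point directly.

```python
import numpy as np

def regressor(x):
    """3x7 state matrix built from the state X = [x1, x2, x3]'."""
    x1, x2, x3 = x
    return np.array([
        [x1,  x2,  x3, 0.0, 1.0, 0.0, 0.0],
        [x2, -x1, 0.0,  x3, 0.0, 1.0, 0.0],
        [x3, 0.0, -x1, -x2, 0.0, 0.0, 1.0],
    ])

# Hypothetical motion parameters.
w1, w2, w3 = 0.001, -0.002, 0.003
T = np.array([2.0, 0.5, -0.3])
theta = np.array([1.0, w1, w2, w3, *T])   # parameter vector, leading entry 1

R = np.array([
    [1.0,  w1,  w2],
    [-w1, 1.0,  w3],
    [-w2, -w3, 1.0],
])
x = np.array([10.0, 20.0, 30.0])

y_regression = regressor(x) @ theta   # linear regression form
y_motion = R @ x + T                  # rotation followed by translation
print(y_regression, y_motion)
```

The leading entry of the parameter vector is fixed at 1 because it multiplies the diagonal of the rotation matrix.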

[0026] The model provided in Equation (3) describes the observed variable $y_t$
as an unknown linear combination of the components of the observed vector $\phi_t$.
Assuming the data acquisition takes place in discrete time, then, at any time $t$, a
sequence of measurements $y_1, y_2, \ldots, y_t$ is received. A data vector $y^t$ may then be
defined by:

$$y^t = [y_t \; y_{t-1} \; \cdots \; y_1] \tag{7}$$

[0027] The primary goal of motion parameter identification is to obtain a model of the
system such that the model has the same output as the observed data. The model is
parameterized by an unknown parameter vector $\theta$. Therefore, the identification
problem can be formulated as the determination of a transformation from the data $y^t$ to
the model parameter $\theta$. The transformation may be represented symbolically as:

$$y^t \rightarrow \theta(t, y^t) \tag{8}$$

[0028] In practice, $\theta$ may never be known; however, an estimate of $\theta$ can be
obtained from information contained in $y^t$. Denoting a guess, or a prediction, of $\theta$ by
$\hat{\theta}$, the corresponding transformation and estimate of system output are based on the
following equations:

$$y^t \rightarrow \hat{\theta}(t, y^t) \tag{9}$$

and

$$\hat{y}_{t \mid \hat{\theta}} = \phi_t' \hat{\theta} \tag{10}$$

[0029] Since there are limitations to collecting data in a finite time, the above
procedure preferably terminates in some finite number of steps, e.g., $N$. Symbolically,
this may be written as:

$$\hat{\theta}_N = \hat{\theta}(N, y^N) \tag{11}$$

[0030] Because $\hat{\theta}$ is an estimate of $\theta$ based on information contained in $y^t$, it is, in
general, a function of previous data and it depends on the previous values
$y_k$ $(k = 1, \ldots, t)$. However, in practice, computation time and memory space are restricted
to an auxiliary memory space $S_t$ of fixed dimensions. This auxiliary vector is updated
according to the following algorithm:

$$\hat{\theta}_t = F(\hat{\theta}_{t-1}, S_t, y_t) \tag{12}$$

where

$$S_t = H(S_{t-1}, \hat{\theta}_{t-1}, y_t) \tag{13}$$

[0031] $F(\cdot)$ and $H(\cdot)$ of the above equations are functions described in further
detail below. From Equations (12) and (13), it is seen that the estimate $\hat{\theta}_t$ is a function
of the current data, the previous estimate $\hat{\theta}_{t-1}$, and an auxiliary variable $S_t$.
Preferably, the only information that is stored at time $t$ is $\hat{\theta}_t$ and $S_t$. The problem of
recursive identification therefore is reduced to a suitable model parameterization and to
appropriately choosing the functions $F(\cdot)$ and $H(\cdot)$. In order to choose the functions,
if the error between the estimated output and the measurement is defined by $e_t$, then

$$e_t = y_t - \hat{y}_{t \mid \hat{\theta}} \tag{14}$$

or

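$$e_t = y_t - \phi_t' \hat{\theta} \tag{15}$$

[0032] A suitable approach is to find an estimate $\hat{\theta}_t$ such that the corresponding
error $e_t$ is minimal. A criterion function, denoted by $V_N$, is defined as a sum of the
quadratic of the error, that is

$$V_N = \sum_{t=1}^{N} e_t' e_t \tag{16}$$

[0033] Defining

$$P_t = \left[ \sum_{k=1}^{t} \phi_k \phi_k' \right]^{-1} \tag{17}$$

it can be observed from Equations (16) and (17) that one may apply the recursive least
squares (RLS) algorithm

$$P_t = P_{t-1} - K_t \phi_t' P_{t-1} \tag{18}$$

$$\hat{\theta}_t = \hat{\theta}_{t-1} + K_t \left( y_t - \phi_t' \hat{\theta}_{t-1} \right) \tag{19}$$

where

$$K_t = P_{t-1} \phi_t \left[ I + \phi_t' P_{t-1} \phi_t \right]^{-1} \tag{20}$$

with the initial conditions $P_0$ positive and $\hat{\theta}_0$ given.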

[0034] Functions $F(\cdot)$ and $H(\cdot)$ are now identified, and, with reference to FIG. 1,
the RLS algorithm for the calibration is now described. At step 10, the point set from
observation, $X_{ik}$ $(i = 1, \ldots, m;\; k = 1, \ldots, n)$, is obtained, where $m$ is the number of points in the
point set and $n$ is the number of observations. At step 12, an iteration is initialized by
setting $P_0$ positive, $t = 1$, and $\hat{\theta}_0 = [1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0]'$.

[0035] Within the iteration, starting at step 14, the covariance matrix $P_t$ is computed
and, at step 16, the estimate parameter vector $\hat{\theta}_t$ is computed. The motion output is
then computed at step 18, based on step 14 above. The motion output is then
compared with $X_t$ at step 20, and steps 14-20 are repeated for $t < n$ (steps 22, 24).
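The noise-free iteration of FIG. 1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation of the preferred embodiment: it uses the standard recursive least squares gain, parameter, and covariance updates, a moderate initial covariance of $10^6$ chosen for numerical convenience, and synthetic noise-free point pairs generated from hypothetical motion parameters. The helper names `regressor`, `rls_step`, and `estimate_motion` are hypothetical, and the output comparison of step 20 is left to the caller.

```python
import numpy as np

def regressor(x):
    """3x7 state matrix built from the state X = [x1, x2, x3]'."""
    x1, x2, x3 = x
    return np.array([
        [x1,  x2,  x3, 0.0, 1.0, 0.0, 0.0],
        [x2, -x1, 0.0,  x3, 0.0, 1.0, 0.0],
        [x3, 0.0, -x1, -x2, 0.0, 0.0, 1.0],
    ])

def rls_step(theta, P, phi_T, y):
    """One standard RLS update; phi_T is the 3xp state matrix, y the observation."""
    phi = phi_T.T
    # Gain computed from the previous covariance.
    K = P @ phi @ np.linalg.inv(np.eye(len(y)) + phi_T @ P @ phi)
    theta_new = theta + K @ (y - phi_T @ theta)   # parameter update
    P_new = P - K @ phi_T @ P                     # covariance update
    return theta_new, P_new

def estimate_motion(X_prev, X_next, p0=1e6):
    """Iterate over paired observations (X_t, X_{t+1}) and return the estimate."""
    theta = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # initial estimate (step 12)
    P = p0 * np.eye(7)                                      # positive initial covariance
    for x, y in zip(X_prev, X_next):
        theta, P = rls_step(theta, P, regressor(x), y)
    return theta

# Synthetic, noise-free data from hypothetical motion parameters.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, 0.001, -0.002, 0.003, 2.0, 0.5, -0.3])
X_prev = rng.uniform(-50.0, 50.0, size=(40, 3))
X_next = np.array([regressor(x) @ true_theta for x in X_prev])

theta_hat = estimate_motion(X_prev, X_next)
print(np.round(theta_hat, 4))
```

The synthetic points are drawn at random rather than along a single straight trajectory because a single point moving on a nearly straight line does not by itself determine all seven entries of the parameter vector; using several feature points, as in the point set described above, has a similar effect.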

[0036] In practice, the individual state variables of a dynamic system cannot be 
determined exactly by direct measurements. Instead, the measurements are functions 
of the state variables with random noise. The system itself may also be subject to
random disturbances. In this case, the state variables are estimated from noisy
observations. 

[0037] A moving camera with noise can be modeled by adding a noise vector
$e = [e_1 \; e_2 \; e_3]'$ defining the rotation and translation noise, where $e_1$, $e_2$, $e_3$ are
independent normally distributed random noise. Since the noise is additive, a
transformation matrix $c$ with diagonal entries $c_1$, $c_2$, $c_3$ may be defined. The model for
rotation with translation and with noise can then be written as:

$$X_{t+1} = R X_t + T + c e \tag{21}$$

$$c = \begin{bmatrix} c_1 & 0 & 0 \\ 0 & c_2 & 0 \\ 0 & 0 & c_3 \end{bmatrix} \tag{22}$$


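[0038] Equation (21) can be written in the same format as the model in Equation (3)
as follows:

$$y_t = \phi_t' \theta \tag{23}$$

where

$$y' = [x_1 \; x_2 \; x_3] \tag{24}$$

$$\phi' = \begin{bmatrix} x_1 & x_2 & x_3 & 0 & 1 & 0 & 0 & e_1 & 0 & 0 \\ x_2 & -x_1 & 0 & x_3 & 0 & 1 & 0 & 0 & e_2 & 0 \\ x_3 & 0 & -x_1 & -x_2 & 0 & 0 & 1 & 0 & 0 & e_3 \end{bmatrix} \tag{25}$$

$$\theta' = [1 \; \omega_1 \; \omega_2 \; \omega_3 \; t_x \; t_y \; t_z \; c_1 \; c_2 \; c_3] \tag{26}$$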
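As a brief check of the model with noise, the sketch below builds the 3x10 state matrix for one point and one noise sample and confirms that its product with the ten-entry parameter vector matches rotation, translation, and scaled noise applied directly. The helper name `noisy_regressor` and the numeric values are hypothetical.

```python
import numpy as np

def noisy_regressor(x, e):
    """3x10 state matrix for state x = [x1, x2, x3]' and noise e = [e1, e2, e3]'."""
    x1, x2, x3 = x
    e1, e2, e3 = e
    return np.array([
        [x1,  x2,  x3, 0.0, 1.0, 0.0, 0.0,  e1, 0.0, 0.0],
        [x2, -x1, 0.0,  x3, 0.0, 1.0, 0.0, 0.0,  e2, 0.0],
        [x3, 0.0, -x1, -x2, 0.0, 0.0, 1.0, 0.0, 0.0,  e3],
    ])

# Hypothetical parameters and one normally distributed noise sample.
rng = np.random.default_rng(1)
w1, w2, w3 = 0.001, -0.002, 0.003
T = np.array([2.0, 0.5, -0.3])
c = np.diag([0.05, 0.06, 0.05])
R = np.array([
    [1.0,  w1,  w2],
    [-w1, 1.0,  w3],
    [-w2, -w3, 1.0],
])
x = np.array([10.0, 20.0, 30.0])
e = rng.standard_normal(3)

theta = np.concatenate(([1.0, w1, w2, w3], T, np.diag(c)))
y_regression = noisy_regressor(x, e) @ theta   # regression form with noise columns
y_motion = R @ x + T + c @ e                   # motion model with additive noise
print(y_regression, y_motion)
```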

[0039] Equations (19) and (20) can be applied to Equation (23), and a recursive
least squares solution can be obtained for the model with noise, as follows, now with
reference to FIG. 2. The point set from observation, $X_{ik}$ $(i = 1, \ldots, m;\; k = 1, \ldots, n)$, is obtained
at step 30, where $m$ is the number of points in the point set and $n$ is the number of
observations. At step 32, an iteration is initialized by setting $P_0$ positive, $t = 1$, and
$\hat{\theta}_0 = [1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0]'$. At step 34, within the iteration, the covariance
matrix $P_t$ is computed, a random noise vector $e = [e_1 \; e_2 \; e_3]'$ is computed at step
36, and at step 38, the estimate parameter vector $\hat{\theta}_t$ is computed including a
transformation of the random noise. At step 40, the motion output is computed
based on step 34 above, and, at step 42, the motion output is compared with $X_t$.
Steps 34-42 are repeated for $t < n$ (steps 44, 46).
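The iteration with noise can be sketched along the same lines. The fragment below is an illustration only: it assumes the standard RLS gain, parameter, and covariance updates, an initial covariance of $10^6$, and synthetic observations generated with the same noise samples that enter the state matrix, which is a simplification made so that the result can be checked. The names `noisy_regressor` and `rls_with_noise` and all numeric values are hypothetical.

```python
import numpy as np

def noisy_regressor(x, e):
    """3x10 state matrix: seven motion columns followed by three noise columns."""
    x1, x2, x3 = x
    motion_cols = np.array([
        [x1,  x2,  x3, 0.0, 1.0, 0.0, 0.0],
        [x2, -x1, 0.0,  x3, 0.0, 1.0, 0.0],
        [x3, 0.0, -x1, -x2, 0.0, 0.0, 1.0],
    ])
    return np.hstack([motion_cols, np.diag(e)])

def rls_with_noise(X_prev, X_next, noise, p0=1e6):
    """Estimate the ten parameters from point pairs and per-step noise vectors."""
    theta = np.zeros(10)
    theta[0] = 1.0                 # initial estimate [1 0 ... 0]' (step 32)
    P = p0 * np.eye(10)            # positive initial covariance
    for x, y, e in zip(X_prev, X_next, noise):
        phi_T = noisy_regressor(x, e)   # includes the noise vector of step 36
        # Gain computed from the previous covariance.
        K = P @ phi_T.T @ np.linalg.inv(np.eye(3) + phi_T @ P @ phi_T.T)
        theta = theta + K @ (y - phi_T @ theta)   # parameter update (step 38)
        P = P - K @ phi_T @ P                     # covariance update
    return theta

# Synthetic data from hypothetical parameters (rotation, translation, noise scale).
rng = np.random.default_rng(2)
true_theta = np.array([1.0, 0.001, -0.002, 0.003, 2.0, 0.5, -0.3, 0.05, 0.06, 0.05])
X_prev = rng.uniform(-50.0, 50.0, size=(40, 3))
noise = rng.standard_normal((40, 3))
X_next = np.array([noisy_regressor(x, e) @ true_theta for x, e in zip(X_prev, noise)])

theta_hat = rls_with_noise(X_prev, X_next, noise)
print(np.round(theta_hat, 4))
```

The last three entries of the estimate play the role of the diagonal entries of the noise transformation matrix.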

[0040] The design of a filter, in the preferred embodiment, is based on the 
differences between parameters generating the desired motion and parameters 
estimated based on the motion measurements. These differences form an additional 
dynamic system. The system provides a compensation transformation to bring the 
motion closer to the desired trajectory. Designating the desired rotation matrix by $R_d$
and the desired translation vector by $T_d$, the desired motion of the camera is described
by

$$X_{t+1} = R_d X_t + T_d \tag{27}$$

Further designating the estimated rotation matrix by $R_e$, the estimated translation
vector by $T_e$, and the noise transformation matrix by $c_e$, the estimated motion is
described by

$$X_{t+1} = R_e X_t + T_e + c_e e \tag{28}$$

[0041] The rotation matrix of the filter is the difference between the estimated 
rotation matrix and desired rotation matrix, and the translation vector of the filter is the 
difference between the estimated translation vector and the desired translation vector. 
Denoting the filter rotation matrix by $R_f$ and the filter translation vector by $T_f$, then
$R_f = R_e - R_d$ and $T_f = T_e - T_d$. These filter matrices are the filter state parameters, and
the sequence of images taken by the camera provides the input to the filter. Using the
image input, corrected outputs are generated by the following motion model:

$$X_{t+1} = R_f X_t + T_f + c_e e \tag{29}$$
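The filter construction described above amounts to two differences and one evaluation of the motion model. The following sketch computes the filter rotation matrix and filter translation vector and evaluates the filter model once. The function names and every numeric value are hypothetical, and the noise sample is set to zero purely to keep the arithmetic easy to follow.

```python
import numpy as np

def filter_parameters(R_e, T_e, R_d, T_d):
    """Filter state parameters: R_f = R_e - R_d and T_f = T_e - T_d."""
    return R_e - R_d, T_e - T_d

def filter_output(x, R_f, T_f, c_e, e):
    """Filter motion model: R_f X_t + T_f + c_e e."""
    return R_f @ x + T_f + c_e @ e

# Hypothetical estimated motion.
R_e = np.array([
    [1.0,    0.001, -0.002],
    [-0.001, 1.0,    0.003],
    [0.002, -0.003,  1.0],
])
T_e = np.array([2.1, 0.4, -0.2])
c_e = np.diag([0.05, 0.06, 0.05])

# Hypothetical desired motion: no rotation, translation along the X-axis only.
R_d = np.eye(3)
T_d = np.array([2.0, 0.0, 0.0])

R_f, T_f = filter_parameters(R_e, T_e, R_d, T_d)
x = np.array([10.0, 20.0, 30.0])
e = np.zeros(3)   # noise sample set to zero for this illustration
x_out = filter_output(x, R_f, T_f, c_e, e)
print(R_f, T_f, x_out)
```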

[0042] As an illustration, consider a camera that is designed to move along a 
straight line determined by the X -axis, with a constant camera orientation. Then an 
effective system should identify, and filter, any non-zero rotation and translation 
parameters along the Y -axis and the Z -axis. 

[0043] With reference now to FIG. 3, a setup is shown in order to demonstrate how 
an exemplary camera motion system is set up and how the camera motion is 
measured. A black and white checkerboard 50 is used as the subject for the set of 
digital images that are input to the system. The data are collected as the camera 52
moves along a slider 54. The path of the slider represents the predetermined path for
the system. 

[0044] A camera that is allowed to move from one end of a slider to the other is said 
to be in straight-line motion. In this case, the slider 54 is set up so that it is parallel to 
the checkerboard 50, and it is assumed that the camera sensor 56 is parallel to the 
checkerboard 50. 

[0045] Assuming the slider 54 represents the X -axis in camera coordinates, the 
camera moves from one end of the slider 54 to the other in a set of discrete steps. At 
each step, an image is taken of the checkerboard 50. In the example shown, the step
size is 2 mm, and 201 images are collected. Each collected image contains a full view 
of the checkerboard. A corner detection algorithm is used to determine the corner 
positions 58, 60, 62 and 64 on the checkerboard 50 in each collected frame. A simple 
matching algorithm is used to match the corners from frame to frame. A sequence of 
points determined by the image sequence forms an observed trajectory, and the points 
over the slider form a desired point trajectory. One particular experiment yielded the 
results shown in FIG. 4. Compared with the desired point trajectory 66, it is seen that 
there is an error of about 3 pixels along the Y -axis 68, and there is an error of about 8 
pixels along the Z -axis 70 for the observed trajectory 72. 

[0046] The ability of the RLS algorithm to perform camera motion parameter 
estimation is now demonstrated. A sequence of 3-D points from the observed point
sequence 72 in FIG. 4 is used as the observed values for Equation (19). Values as
large as $10^{12}$ are used for the initial value of the covariance matrix $P_0$. The initial
estimate parameter vector $\hat{\theta}_0$ is set to $[1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0]'$. The noise
matrix is initialized as $c_{(1,1)} = 0.05$, $c_{(2,2)} = 0.06$ and $c_{(3,3)} = 0.05$. A set of 201 unit random,
white, noisy points, together with the parameter vector of Equation (26) and the state
matrix of Equation (25), is used. The RLS algorithm is allowed to continue through 200
iterations ($n = 200$).

[0047] With reference to FIG. 5, the resulting rotation variables 74 are plotted versus 
the iteration count 76 for each image frame used in the RLS algorithm. Likewise, with 
reference to FIG. 6, translation variables 78 and noise variables 80 are plotted against 
the iteration count 76. 






[0048] It can be readily observed that the estimation of the parameter vector 
reaches a stable value after about 20 iterations. A set of the computed estimation is 
obtained and the error of the parameter 82 and state estimation 84 are plotted in FIG. 
7. The final computed estimation parameters are: 

Estimated rotation variables: 0.0000 -0.0004 -0.0008 
Estimated translation variables: 2.2705 1.2526 -0.6351 
Estimated noise variables: 0.0211 0.0144 0.0046 
[0049] After the error parameter estimates are made, they are used to construct a 
model of the motion and to compare it to the observed point trajectory. The graph 
shows good agreement between the predicted motion 86 and the actual motion 88. 
[0050] A filter based on the difference between the desired motion and estimated 
motion is now constructed, and the ability of the filter to correct the camera motion is 
demonstrated. Estimated rotation and translation parameters are used to calculate the 
difference from the desired motion. The estimated motion forms a rotation matrix and a
translation vector with

$$R = \begin{bmatrix} 1 & 0 & -0.0004 \\ 0 & 1 & -0.0008 \\ 0.0004 & 0.0008 & 1 \end{bmatrix}$$

and

$$T = [2.2705 \; 1.2526 \; -0.6351]'$$

[0051] The desired motion is linear (along the X-axis) with no rotation. Therefore, the
desired rotation matrix is the 3x3 identity matrix $I$. A rotation matrix of $R - I$ is used to
rotate the observed point, and a translation vector of $T - T_d$, where $T_d = [t_d \; 0 \; 0]'$ and
$t_d$ is the desired movement for each step in pixels, is used to translate the observed
point to the corrected point. With reference now to FIG. 8, it can be observed that the
filter outputs of the corrected point form a motion trajectory 90 close to the desired point 
trajectory 92 compared with the observed point trajectory 94. In this experiment, the 
error between two trajectories is defined as the distance between two corresponding 
points from the two trajectories. The average error between the filter output trajectory
90 and the desired output trajectory 92 is approximately 0.27 pixels, compared with
approximately 2.5 pixels of error between the observed trajectory 94 and the desired
output trajectory 92. The maximum error between the filter output trajectory 90 and
the desired output trajectory 92 is approximately 0.45 pixels, compared with
approximately 7.3 pixels of error between the observed trajectory 94 and the desired
output trajectory 92.
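The error measure used in this experiment, the distance between corresponding points of two trajectories, can be computed as in the following sketch. The function name `trajectory_error` and the three-point trajectories are hypothetical and are not the experimental data.

```python
import numpy as np

def trajectory_error(a, b):
    """Average and maximum distance between corresponding points of two trajectories."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    distances = np.linalg.norm(a - b, axis=1)   # one Euclidean distance per point pair
    return distances.mean(), distances.max()

# Hypothetical trajectories with three corresponding points each.
desired = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
observed = np.array([[0.0, 3.0, 4.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.0]])

mean_err, max_err = trajectory_error(observed, desired)
print(mean_err, max_err)
```

Applying the same function to the filter output trajectory and to the observed trajectory, each against the desired trajectory, would give the kind of average and maximum figures quoted above.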

[0052] With reference now to FIG. 9, an exemplary system, suitable for incorporating 
embodiments of the present invention is shown. A digital camera 100, including a 
lens/sensor 102 is movably mounted on a track 104. A camera control and movement 
means 106, e.g., an electric motor, drive system, etc., is provided to control movement 
and rotation of the camera along the track 104 which, in this exemplary embodiment, 
comprises the X -axis. It is to be appreciated, however, that the track 104, and the 
movement means 106 are shown only for purposes of understanding the present 
invention. The movement means may just as well comprise a moving vehicle, such as 
an automobile, and the motion may be allowed 3-dimensionally, along 3 axes, not 
constrained to a single linear axis. The figure also shows an object 108 which, similarly, 
may be a room, or even an outdoor scene. In the case of a room, or a confined outdoor 
area, the track 104 and movement means 106 may be a cable system supporting the
camera, thereby enabling controlled motion of the camera in an essentially planar 
fashion. The object 108, however, is useful for testing and calibration of the system as 
previously described. 

[0053] A computer system 110 is shown connected to the camera by communication
means 112 which may be a network, cable or wireless connection. While the computer
system 110 is shown in the figure as being connected directly to the camera, it is to be
appreciated that the camera 100 may be used to accumulate images in an offline
environment, wherein the images are then transferred to the computer system 110 for
processing at a later time. 

[0054] The computer system 110 includes a user interface 114 and a storage means 
116 for storing programs and image data collected from the camera 100. The storage 
means may comprise a random access disk system, tape storage devices and/or 
random access solid-state memory devices. An image input program 118 is provided 
for receiving image input from the camera 100, and image processing programs 120 
are further provided for processing the images, and storing the images on the storage 
means 116. An RLS program 122 is provided which is programmed to perform the 
above-described RLS algorithm, including calibration, modeling camera motion without
noise, modeling camera motion with noise, and correcting an observed trajectory by
means of the above-described filter. 

[0055] The invention has been described with reference to the preferred 
embodiments. Modifications and alterations will occur to others upon a reading and 
understanding of the specification. It is our intention to include all such modifications 
and alterations insofar as they come within the scope of the appended claims, or the 
equivalents thereof. 
[0056] What is claimed is: